Abstract

We study the asymptotic behavior of the maximum likelihood estimator
corresponding to the observation of a trajectory of a Skew Brownian
motion, through a uniform time discretization. We characterize the
speed of convergence and the limiting distribution when the step size
goes to zero, which in this case are non-classical, under the null
hypothesis of the Skew Brownian motion being a usual Brownian motion.
This allows us to design a test on the skewness parameter. We show that
numerical simulations can be easily performed to estimate the skewness
parameter, and provide an application in Biology.

Keywords: Skew Brownian motion, statistical estimation, maximum likelihood.



1 Introduction

The Skew Brownian motion (SBm) has attracted interest, among other reasons,
due to its relations with diffusions with discontinuous coefficients or with media
with permeable barriers, being the first example of the solution of a stochastic
differential equation with the local time of the solution as drift [6]: the SBm
$X = \{X_t : 0 \le t \le T\}$ can be defined as the strong solution of the stochastic
differential equation

$$X_t = x + B_t + \theta \ell_t^x, \tag{1}$$

where $B = \{B_t : 0 \le t \le T\}$ is a standard Brownian motion defined on a
probability space $(\Omega, \mathcal{F}, \mathbb{P})$, the initial condition is $x \ge 0$ (the case $x < 0$ is
symmetrical), $\theta \in [-1, 1]$ is the skewness parameter, and $\ell^x = \{\ell_t^x : 0 \le t \le T\}$
is the local time at level zero of the (unknown) solution $X$ of the equation
departing from $x$, defined by

$$\ell_t^x = \lim_{\varepsilon \to 0} \frac{1}{2\varepsilon} \int_0^t 1_{(-\varepsilon,\varepsilon)}(X_s)\,ds. \tag{2}$$

In the case $x = 0$ we denote $\ell = \ell^0$, this case being particularly interesting due
to some explicit calculations that can be carried out (see Section 3).

In the literature the skewness parameter is sometimes defined as $p = (\theta + 1)/2$,
this second parametrization being more convenient for an alternative
construction of the SBm: start from the reflected Brownian motion and
choose, independently and with probability $p \in [0,1]$, whether each particular
excursion of the reflected Brownian motion remains positive.

In the special case $\theta = 1$ ($p = 1$), the solution to (1) is the reflected
Brownian motion. The case $\theta = 0$ ($p = 1/2$) corresponds to the standard
Brownian motion.
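The excursion-based construction can be illustrated in discrete time. The sketch below is not taken from the paper: it assumes a simple random walk as a stand-in for the Brownian motion, reflects it, and draws one sign per excursion away from zero (positive with probability $p$). The function name `skew_walk` and its parameters are hypothetical.

```python
import random

def skew_walk(n_steps, p, seed=0):
    """Discrete analogue of the excursion construction (illustrative assumption):
    reflect a simple random walk and give each excursion away from 0 a sign,
    positive with probability p = (theta + 1) / 2."""
    rng = random.Random(seed)
    s, sign, path = 0, 1, [0]
    for _ in range(n_steps):
        if s == 0:
            # A new excursion of the reflected walk starts: draw its sign once.
            sign = 1 if rng.random() < p else -1
            s = 1
        else:
            s += rng.choice((-1, 1))
        path.append(sign * s)
    return path

# p = 1 gives a reflected walk, p = 1/2 a symmetric one.
reflected = skew_walk(1000, 1.0, seed=1)
```

Rescaling space by $1/\sqrt{n}$ and time by $1/n$ is the usual way to relate such a walk to the continuous process; that scaling step is left out of the sketch.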

Recently, several papers have considered the SBm in modelling or simulation
issues, as well as some optimization problems. See the review by
A. Lejay [13] for references on the subject, as well as a survey of the various
possible constructions and applications of the SBm.

In this paper we are interested in the statistical estimation of $\theta$, the
skewness parameter, when we observe a trajectory of the process through
an equally spaced time grid. From the statistical point of view we find this
problem interesting because it is, in a certain sense, intermediate between the
classical problem of drift estimation in a diffusion, where the measures
generated by the trajectories of the process for different values of the parameter
are equivalent (lH[l7| , and the estimation of the variance (the volatility in 
financial terms) of a diffusion (see for instance [4], or [9] and the references
therein), where the probability measures generated by the trajectories are
singular for different values of the parameter. To the best of our knowledge,
the only estimator of $\theta$ is the one constructed by O. Bardou and M. Martinez [1],
where they assume that the SBm is reflected at levels 1 and $-1$ to
ensure ergodicity, considering a different scheme of observation of the trajectory.
Our main result states that the maximum likelihood estimator (MLE)
corresponding to the observation of a discretization of one trajectory of the
process, with the corresponding normalization, satisfies the so-called Local
Asymptotic Mixed Normality (LAMN) property at the point $\theta = 0$. With
this result and the identification of the limiting distribution of the scaled MLE,
one may construct a hypothesis test to determine whether or
not the Brownian motion is skew. The LAMN property suggests certain asymptotic
properties of the MLE, as exposed for instance in the classical book of Ibragimov
and Has'minskii [?]. Nevertheless, as our results in terms of convergence of
statistical experiments are not exactly the ones needed in the hypotheses of
general LAMN theorems, we follow a direct approach to construct the estimator
and to study its asymptotic properties. This approach, which can be
followed only on rare occasions, has the advantage of clarifying the proof of the
asymptotic properties and providing insight into the corresponding numerical
computations.

The rest of the paper is organized as follows. Section 2 describes the
maximum likelihood methodology and the convergence results. In Section 3
we describe the limit distribution. Section 4 presents the statistical test and
some numerical simulations on the likelihood function. Section 5 presents an
application to the diffusion of species in two different habitats, and Section 6 our
conclusions. Finally, in the Appendix we provide the theorems taken from
[8] used in the proof of our main results in Section 2.

2 The maximum likelihood estimator 

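Consider the SBm $X$ with parameter $\theta \in (-1, 1)$ defined in (1) and the
sampling scheme denoted by $X_i := X_{iT/n}$ ($i = 0, \dots, n$), and $\Delta = T/n$.
In this section we derive the asymptotic behaviour of the maximum likelihood
estimator $\theta_n$ of the parameter $\theta$ when we observe the sample $X_1, \dots, X_n$.
The transition density of the SBm of parameter $\theta \in [-1, 1]$ is given by:

$$q_\theta(t, x, y) = p(t, y - x) + \operatorname{sgn}(y)\,\theta\, p(t, |x| + |y|),$$

where

$$p(t, x) = \frac{1}{\sqrt{2\pi t}} \exp\left(-\frac{x^2}{2t}\right)$$

is the density of a Gaussian random variable with variance $t$ and mean 0.
The likelihood of the sample, normalized by its value at $\theta = 0$, is given by

$$Z_n(\theta) = \prod_{i=0}^{n-1} \frac{q_\theta(\Delta, X_i, X_{i+1})}{q_0(\Delta, X_i, X_{i+1})}.$$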
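Observing that for any $x, y \in \mathbb{R}$ we have

$$\frac{p(\Delta, |x| + |y|)}{q_0(\Delta, x, y)} = \exp\left(-\frac{|xy| + xy}{\Delta}\right) = \exp\left(-\frac{2}{\Delta}(xy)^+\right) \le 1, \tag{3}$$

(where $z^+ = (|z| + z)/2 = \max(z, 0)$), we can write

$$Z_n(\theta) = \prod_{\substack{X_i > 0 \\ X_{i+1} < 0}} (1 - \theta) \prod_{\substack{X_i < 0 \\ X_{i+1} > 0}} (1 + \theta) \prod_{\substack{X_i < 0 \\ X_{i+1} < 0}} \left(1 - \theta e^{-2 X_i X_{i+1}/\Delta}\right) \prod_{\substack{X_i > 0 \\ X_{i+1} > 0}} \left(1 + \theta e^{-2 X_i X_{i+1}/\Delta}\right) = \prod_{i=0}^{n-1} \left(1 + \theta\, h\left(\sqrt{n} X_i, \sqrt{n}(X_{i+1} - X_i)\right)\right),$$

where

$$h(x, y) = \operatorname{sgn}(x + y) \exp\left(-(2/T)(x(x + y))^+\right),$$

to see that $Z_n(\theta)$ is a polynomial of degree $n$, with $n$ real roots. Remember
that we assume $X_0 = x \ge 0$. In case the trajectory we observe does not hit
the zero level, we obtain

$$Z_n(\theta) = \prod_{i=0}^{n-1} \left(1 + \theta e^{-2 X_i X_{i+1}/\Delta}\right)$$

and $Z_n(\theta)$ is increasing in $\theta$. In this case our maximum likelihood estimator
is $\theta_n = 1$. In the case that the trajectory crosses the zero level, we see that
the polynomial has roots $\theta = \pm 1$ (for large enough $n$), and no roots inside
this interval. As $Z_n(0) = 1$, this gives a unique maximum at the point $\theta_n$ in
the interval $(-1, 1)$.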
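As a minimal sketch of how the estimator can be computed from a sampled path, the code below evaluates the factors $h_i = \operatorname{sgn}(X_{i+1})\exp(-(2/\Delta)(X_i X_{i+1})^+)$ of the likelihood ratio and maximizes $\log Z_n$ by bisection on its derivative. The bisection solver and the function names are assumptions made for illustration; the paper only states that a numerical procedure is used.

```python
import math

def h_values(xs, delta):
    """Factors h_i such that Z_n(theta) = prod_i (1 + theta * h_i)."""
    out = []
    for a, b in zip(xs[:-1], xs[1:]):
        sgn = (b > 0) - (b < 0)
        out.append(sgn * math.exp(-2.0 * max(a * b, 0.0) / delta))
    return out

def log_Z(theta, hs):
    """Logarithm of the likelihood ratio; equals 0 at theta = 0."""
    return sum(math.log1p(theta * h) for h in hs)

def mle(hs, tol=1e-10):
    """Maximise the concave map theta -> log Z_n(theta) on [-1, 1]
    by bisection on its (decreasing) derivative."""
    score = lambda t: sum(h / (1.0 + t * h) for h in hs)
    lo, hi = -1.0 + 1e-9, 1.0 - 1e-9
    if score(hi) >= 0.0:      # still increasing at the right end
        return 1.0
    if score(lo) <= 0.0:      # already decreasing at the left end
        return -1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# A path that never reaches zero gives the boundary value 1.
print(mle(h_values([1.0, 1.2, 0.9, 1.1], delta=0.1)))
```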

Our main result is the weak convergence of the MLE to a distribution that
we characterize. Three main differences can be noted with respect to the
classical statistical situation: (i) the convergence of the estimator is slower
(rate $n^{1/4}$) than in the classical case (rate $n^{1/2}$); (ii) the limit is not Gaussian, but a mixture
of Gaussian random variables; (iii) the convergence is stable, which is stronger than
the usual convergence in distribution but natural in this context, known as
local asymptotic mixed normality (LAMN) in the literature (see for instance
[12]). We also have to take into account, in accordance with our previous
discussion on the existence and the value of the MLE, that the event that the
trajectory hits the level zero is crucial in the results we obtain (in fact, if
the trajectory does not hit this level, the MLE remains constant for all $n$).
Consider then the events

$$A_n = \{\omega : \inf_{1 \le i \le n} X_i(\omega) < 0\}, \qquad A = \{\omega : \inf_{0 \le t \le T} X_t(\omega) < 0\}. \tag{4}$$

As $X$ is continuous, $1_{A_n} \to 1_A$ a.s. ($1_B$ stands for the indicator of the set
$B$). We now review stable convergence (see [10]), and introduce the
conditional stable convergence that will take place in our case.

Definition 1. Consider a sequence of random variables $Y, Y_1, Y_2, \dots$ defined
on the probability space $(\Omega, \mathcal{F}, \mathbb{P})$, and a $\sigma$-algebra $\mathcal{G} \subset \mathcal{F}$.

We say that the sequence of random variables $Y_1, Y_2, \dots$ converges $\mathcal{G}$-stably
in distribution to $Y$, and denote

$$Y_n \xrightarrow[n \to \infty]{\mathcal{G}\text{-stably}} Y$$

when

$$E(Z f(Y_n)) \xrightarrow[n \to \infty]{} E(Z f(Y))$$

for any bounded $\mathcal{G}$-measurable random variable $Z$, and any bounded and
continuous function $f$.

Furthermore, consider a sequence of sets $A, A_1, A_2, \dots$. We say that the
sequence of random variables $Y_1, Y_2, \dots$ conditional on $A_n$ converges $\mathcal{G}$-stably
in distribution to $Y$ conditional on $A$, and denote

$$Y_n \mid A_n \xrightarrow[n \to \infty]{\mathcal{G}\text{-stably}} Y \mid A,$$

when

$$E(Z f(Y_n) \mid A_n) \xrightarrow[n \to \infty]{} E(Z f(Y) \mid A)$$

for any bounded $\mathcal{G}$-measurable random variable $Z$, and any bounded and
continuous function $f$.

We are now in a position to present our main result. This theorem
will be an immediate consequence of Theorem 2 below.

Theorem 1. Consider a Skew Brownian motion defined in (1) with the
sampling scheme described at the beginning of Section 2 and the events $A_n$ and
$A$ defined in (4). Then for the maximum likelihood estimator $\theta_n$ we have the
convergence

$$n^{1/4}\theta_n \mid A_n \xrightarrow[n \to \infty]{\mathcal{F}\text{-stably}} \frac{W(\ell_T^x)}{\ell_T^x} \mid A,$$

under the Brownian motion distribution (that is, when $\theta = 0$), where $W =
\{W_t : t \ge 0\}$ is a standard Brownian motion independent of $B$. In particular,
when $x = 0$, we have

$$n^{1/4}\theta_n \xrightarrow[n \to \infty]{\mathcal{F}\text{-stably}} \frac{W(\ell_T)}{\ell_T}. \tag{5}$$



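2.1 Some results on derivatives of the log-likelihood

In order to study the asymptotic behaviour of $\theta_n$, the MLE, we consider the
log-likelihood, defined by

$$L_n(\theta) = \log \prod_{i=0}^{n-1} q_\theta(\Delta, X_i, X_{i+1}) \tag{6}$$

and introduce its scaled (for notational convenience) $k$-th derivatives, for
$k \ge 1$, by

$$L_n^{(k)}(\theta) = \frac{1}{(k-1)!} \frac{\partial^k L_n(\theta)}{\partial \theta^k},$$

that are computed as

$$L_n^{(k)}(\theta) = (-1)^{k-1} \sum_{i=0}^{n-1} \left( \frac{\operatorname{sgn}(X_{i+1})\, p(\Delta, |X_i| + |X_{i+1}|)}{q_\theta(\Delta, X_i, X_{i+1})} \right)^k. \tag{7}$$

An analytical development of $L_n^{(1)}(\theta)$ holds around 0:

$$L_n^{(1)}(\theta) = \sum_{k=0}^{+\infty} \theta^k L_n^{(k+1)}(0). \tag{8}$$

Condition (3) implies that $|L_n^{(k)}(0)| \le n$ and thus the series in (8) is absolutely
convergent for $|\theta| < 1$.

Introduce, for $k = 1, 2, \dots$, the sequence of functions

$$h_k(x, y) = \left[\operatorname{sgn}(x + y) \exp\left(-(2/T)(x(x + y))^+\right)\right]^k.$$

We can then rewrite $L_n^{(k)}(0)$, for $k = 1, 2, \dots$, as:

$$L_n^{(k)}(0) = (-1)^{k-1} \sum_{i=0}^{n-1} h_k\left(\sqrt{n} X_i, \sqrt{n}(X_{i+1} - X_i)\right).$$

We then see that the study of the limit behaviour of this type of sums,
presented in the next proposition, can be directly obtained from results obtained
by J. Jacod [8]. (For convenience, we present Jacod's results from [8] in an
Appendix.)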
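A short numerical check of the series expansion of the first derivative may help fix the normalization. The sketch below assumes the scaled derivatives at zero are $L_n^{(k)}(0) = (-1)^{k-1}\sum_i h_i^k$, where $h_i$ denotes the $i$-th factor of the likelihood ratio; this normalization is inferred from the surrounding text, and the function names are hypothetical.

```python
def scaled_derivative_at_zero(hs, k):
    """Assumed normalisation: L_n^{(k)}(0) = (-1)^(k-1) * sum_i h_i^k."""
    return (-1) ** (k - 1) * sum(h ** k for h in hs)

def score(theta, hs):
    """First derivative of the log-likelihood: sum_i h_i / (1 + theta * h_i)."""
    return sum(h / (1.0 + theta * h) for h in hs)

def score_series(theta, hs, order):
    """Truncated power series sum_{k=0}^{order} theta^k L_n^{(k+1)}(0)."""
    return sum(theta ** k * scaled_derivative_at_zero(hs, k + 1)
               for k in range(order + 1))

hs = [0.5, -0.8, 0.3, 1.0, -1.0]          # illustrative values with |h_i| <= 1
print(score(0.3, hs), score_series(0.3, hs, 60))
```

Since $|h_i| \le 1$, each scaled derivative is bounded by $n$ in absolute value, which is what makes the series converge for $|\theta| < 1$.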

Proposition 1. Assume that $\theta = 0$ in (1), i.e. let $X$ be a Brownian motion
on $[0, T]$ departing from $x$, and let $\ell_T$ denote its local time at zero.

(a) Assume that $k = 2, 4, \dots$. Denote

$$\mu_k = -2 \int_0^\infty \left(1 + \frac{1}{2k - 1} \exp\left(\frac{2k(k-1)x^2}{(2k-1)^2}\right)\right) \Phi(-x)\,dx, \tag{9}$$

where $\Phi$ is the standard normal distribution function. Then

$$\frac{L_n^{(k)}(0)}{n^{1/2}} \xrightarrow[n \to \infty]{\text{prob.}} \mu_k \ell_T. \tag{10}$$

(b) Assume that $k = 1, 3, \dots$. Denote

$$\mu_k = 2 \int_0^\infty \left(1 + \frac{1}{4k - 1} \exp\left(\frac{4k(2k-1)x^2}{(4k-1)^2}\right)\right) \Phi(-x)\,dx.$$

Then, there exists a Brownian motion $W$ independent from $B$ such that

$$\frac{L_n^{(k)}(0)}{n^{1/4}} \xrightarrow[n \to \infty]{\mathcal{F}\text{-stably}} \mu_k W(\ell_T). \tag{11}$$

Remark 1. Observe that on the event

$$\{\omega : \inf_{0 \le t \le T} X_t(\omega) > 0\}$$

we have $\ell_T(\omega) = W(\ell_T(\omega)) = 0$. In this situation, as all the information
about the relevant parameter $\theta$ is produced when the process hits the level
zero, no statistical inference can be carried out. Observe that in case $x = 0$
we have

$$\mathbb{P}\left(\omega : \inf_{0 \le t \le T} X_t(\omega) > 0\right) = 0.$$



Remark 2. The results of J. Jacod can also be applied to multi-dimensional
statistics. This way, we obtain the joint $\mathcal{F}$-stable convergence of any vector
$n^{-1/4}(L_n^{(1)}(0), \dots, L_n^{(2k+1)}(0))$ for any integer $k$, and then the joint $\mathcal{F}$-stable
convergence of

$$\left(n^{-1/4} L_n^{(1)}(0),\, n^{-1/2} L_n^{(2)}(0),\, \dots,\, n^{-1/4} L_n^{(2k+1)}(0),\, n^{-1/2} L_n^{(2k+2)}(0)\right)
\xrightarrow[n \to \infty]{} \left(\mu_1 W(\ell_T),\, \mu_2 \ell_T,\, \dots,\, \mu_{2k+1} W(\ell_T),\, \mu_{2k+2} \ell_T\right).$$

Proof. We apply Theorem 3 in the Appendix. Observe that

$$|h_k(x, y)| \le \exp\left(-(x(x + y))^+\right) \le \exp\left(|y| - |x \wedge x^2|\right).$$

We then have that (26) holds with $a = 1$, $h(x) = \exp(-|x \wedge x^2|)$ and $r = 0$,
and then it holds for any $r \ge 0$. In consequence, by the aforementioned theorem,
the convergence in (27) holds for $h = h_k$ with $k = 2, 4, \dots$. It remains to
compute the constant in (28). We have

$$\begin{aligned}
c(h_k) &= \iint_{\mathbb{R}^2} h_k(x, y)\, p(1, y)\,dx\,dy \\
&= 2 \int_0^\infty dx \int_{-\infty}^{-x} p(1, y)\,dy + \frac{2}{\sqrt{2\pi}} \int_0^\infty dx \int_{-x}^{\infty} \exp\left(-\frac{y^2}{2} - 2kxy - 2kx^2\right) dy \\
&= 2 \int_0^\infty \Phi(-x)\,dx + \frac{2}{\sqrt{2\pi}} \int_0^\infty dx\, \exp\left(2k(k-1)x^2\right) \int_{-x}^{\infty} \exp\left(-\frac{1}{2}(y + 2kx)^2\right) dy \\
&= 2 \int_0^\infty \Phi(-x)\,dx + 2 \int_0^\infty \exp\left(2k(k-1)x^2\right) \Phi(-(2k-1)x)\,dx \\
&= 2 \int_0^\infty \left(1 + \frac{1}{2k-1} \exp\left(\frac{2k(k-1)x^2}{(2k-1)^2}\right)\right) \Phi(-x)\,dx.
\end{aligned} \tag{12}$$

Taking into account that $\mu_k = -c(h_k)$ we conclude that (10) holds with $\mu_k$
given in (9).

To prove (b) we rely on Theorem 4 in the Appendix. Observe then that
$c(h_k) = 0$ for odd $k$ due to the property

$$h_k(-x, -y) = -h_k(x, y) \quad \text{for odd } k.$$

In view of the fact that (26) holds for all $h_k$ with $r = 4$, taking into account
that $(h_k)^2 = h_{2k}$, we conclude that $\mu_k = c(h_{2k})$. In view of the
computations in (12) with $2k$ instead of $k$ we conclude (11), and the proof of the
proposition. □
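The chain of equalities leading to the constant $c(h_k)$ can be checked numerically for even $k$. The sketch below takes $T = 1$ (as in the computation above, where $p(1, y)$ is used) and compares a direct two-dimensional quadrature of $\iint h_k(x,y)\,p(1,y)\,dx\,dy$ with the two one-dimensional forms. The trapezoidal rule, the truncation bounds and all function names are illustrative assumptions; only $k = 2$ is exercised here.

```python
import math

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def exp_times_Phi(a, z):
    """exp(a) * Phi(-z), evaluated in log-space to avoid overflow."""
    t = Phi(-z)
    return math.exp(a + math.log(t)) if t > 0.0 else 0.0

def trapz(f, a, b, n):
    step = (b - a) / n
    return step * (0.5 * (f(a) + f(b)) + sum(f(a + i * step) for i in range(1, n)))

def c_direct(k, nx=200, ny=800, xmax=8.0, ymax=10.0):
    """Double integral of h_k(x, y) p(1, y) for even k, using symmetry in x."""
    def inner(x):
        g = lambda y: (math.exp(-2.0 * k * max(x * (x + y), 0.0))
                       * math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi))
        return trapz(g, -ymax, ymax, ny)
    return 2.0 * trapz(inner, 0.0, xmax, nx)

def c_intermediate(k, upper=10.0, n=4000):
    f = lambda x: Phi(-x) + exp_times_Phi(2.0 * k * (k - 1) * x * x, (2 * k - 1) * x)
    return 2.0 * trapz(f, 0.0, upper, n)

def c_final(k, upper=30.0, n=12000):
    r = 2 * k - 1
    f = lambda x: Phi(-x) + exp_times_Phi(2.0 * k * (k - 1) * x * x / (r * r), x) / r
    return 2.0 * trapz(f, 0.0, upper, n)

print(c_direct(2), c_intermediate(2), c_final(2))
```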

Remark 3. On the event $A_n$, the discrete path has crossed the origin. Hence,
the continuous path did so at a random time $\tau$. Using the strong Markov
property, this implies that one may consider a path starting from 0 for any
time $t \ge \tau$. The local time of the Brownian motion is equal in distribution
to the maximum of the Brownian motion. Hence, on $A_n$ and $A$, $\ell_t > 0$ for
any time $t > \tau$.



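Corollary 1. In the conditions of Proposition 1, for $k = 0, 1, 2, \dots$, we have:

$$\frac{L_n^{(2k+1)}(0)}{L_n^{(2)}(0)} \xrightarrow[n \to \infty]{\text{prob.}} 0, \qquad \frac{L_n^{(2k+2)}(0)}{L_n^{(2)}(0)} \xrightarrow[n \to \infty]{\text{prob.}} \frac{\mu_{2k+2}}{\mu_2}, \tag{13}$$

$$n^{1/4} \frac{L_n^{(2k+1)}(0)}{L_n^{(2)}(0)} \mid A_n \xrightarrow[n \to \infty]{\mathcal{F}\text{-stably}} \frac{\mu_{2k+1} W(\ell_T)}{\mu_2 \ell_T} \mid A. \tag{14}$$

In particular, as $\mu_2 = -\mu_1$, we obtain

$$-n^{1/4} \frac{L_n^{(1)}(0)}{L_n^{(2)}(0)} \mid A_n \xrightarrow[n \to \infty]{\mathcal{F}\text{-stably}} \frac{W(\ell_T)}{\ell_T} \mid A. \tag{15}$$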



Proof. We begin by the second part in (13). As > on the set A, we have 



_ ^-l/2^(2A.+2)(o) ^^^^ /^2A^^ _ /i2fc+2 

Assume now that cj G A'^. We have a(u;) = mio<t<T Xt{^) > a.s. on A'^. 
Now 



|^(2fc+2)^Q)| < ^2k + l)\ sup e^-^'^^/^^^^^^+^lL, 



(2) 



l<j<n 



< i2k + l)!e(-(^'=-2W^)"(-)'|Lf (0)1, 



proof of the first part of (13) 



what gives the second part of (13) on the set A'^. We postpone by now the 



Let us then verify (14). We first prove that 



Ln\0) Lj ^ ■'(0) \ J--stably , U/(px\ •, N 

^1/2 > ^^lyi y^An I > (/i2^r,/i2fc+ll^(^TJ> -l-Aj 

This amounts to prove that, for Z > 0, J-'-measurable and bounded, and real 
A, yU and z/, we have 



5„ := EZ exp[z\ X—^ + /i ^.^^ + 



^ EZ exp {A/i2^^ + mfc+iVr(^T) + J^Ia}) =: 5- (16) 

We know that 



I prob 

n 



I^,1aJ ^(/i2^^,lA) 
as $1_{A_n} \to 1_A$ a.s. We then have



EZ exp i < A 



(2fc+l), 



n 



1/2 



n 



1/4 



+ iyl 



An 



r(2fc+i)/QX 



1/4 



+ 



EZ exp i < A/i2^T + A* 



1/4 



+ 1^1, 



exp M s A 



+ EZ 



EZexp {A/i2£^ + /i/i2fc+iVr(£^) + uIa})] 

r(2)/'n\ 

" ^ ^ ' ^ ' exp(HA/i2£^ + z/lA}) 

- exp (i/i/X2fc+iVr(£^)) 



1/2 



1^1, 



exp 



n 



1/4 



->0, 



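concluding the proof of (16). The proof of (14) follows with the help of the
continuous and bounded function $f_K(t) = t 1_{\{|t| < K\}} + K 1_{\{t \ge K\}} - K 1_{\{t \le -K\}}$.
We have

$$f_K\left(n^{1/4} \frac{L_n^{(2k+1)}(0)}{L_n^{(2)}(0)}\right) \mid A_n \xrightarrow[n \to \infty]{\mathcal{F}\text{-stably}} f_K\left(\frac{\mu_{2k+1} W(\ell_T)}{\mu_2 \ell_T}\right) \mid A$$

for all $K > 0$, and, as the limit is bounded in probability, we obtain (14).

As regards the first part of (13), the computation on the set $A^c$ is
similar to the previous one. On the set $A$, we have

$$\frac{L_n^{(2k+1)}(0)}{L_n^{(2)}(0)} = n^{-1/4} \left[ n^{1/4} \frac{L_n^{(2k+1)}(0)}{L_n^{(2)}(0)} \right] \xrightarrow[n \to \infty]{\text{prob.}} 0,$$

as the expression within brackets has a weak limit.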



□ 



2.2 A simple estimator

The MLE is the point $\theta_n$ at which $\theta \mapsto L_n(\theta)$ reaches its maximum, i.e. $\theta_n$
is the (unique in our case) root of the equation

$$L_n^{(1)}(\theta_n) = 0.$$

Let us set

$$\alpha_n = -n^{1/4} \frac{L_n^{(1)}(0)}{L_n^{(2)}(0)}.$$

From Corollary 1, formula (15), $\alpha_n$ is known to converge $\mathcal{F}$-stably as $n \to \infty$.

Below, we will see that $n^{1/4}\theta_n$ and $\alpha_n$ have the same limit, which
yields Theorem 1.

The value $\alpha_n/n^{1/4}$, which is simple to compute from the data,
especially in contrast to $\theta_n$, which requires a numerical solver,
can be used as an estimator of the skewness parameter.
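The simple estimator only needs the first two scaled derivatives at zero. As a sketch, writing $h_i$ for the factors of the likelihood ratio and assuming the normalization $L_n^{(1)}(0) = \sum_i h_i$, $L_n^{(2)}(0) = -\sum_i h_i^2$ (an inference from the text, not a quotation), the quantity $-L_n^{(1)}(0)/L_n^{(2)}(0)$ reads:

```python
def simple_estimator(hs):
    """Closed-form estimator -L1/L2, i.e. the zero of the linearised score
    L1 + theta * L2 (hypothetical helper; hs are the likelihood factors h_i)."""
    l1 = sum(hs)                      # assumed L_n^{(1)}(0)
    l2 = -sum(h * h for h in hs)      # assumed L_n^{(2)}(0), negative
    return -l1 / l2

hs = [0.5, -0.5, 0.2]                 # illustrative factors
est = simple_estimator(hs)
print(est)
```

This is one Newton step from $\theta = 0$ on the score equation, which is why no iterative solver is required.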

Let us also remark that $\alpha_n$ is chosen so that the first two terms in the
Taylor series (8) of $L_n^{(1)}$ cancel out at $\theta = \alpha_n/n^{1/4}$.



2.3 Asymptotic development of the MLE 

We now prove a theorem and a proposition which imply
Theorem 1.

Theorem 2. For any integer p > 0, there exists a vector {dn \ . . . , dn~^^^) of 
random variables given the recursive relation dn^ = 1 and 

m+l r{fe+l)/p,\ 

= - E T(3vlr E (18) 

k=l lUj l<ii,...,jj.<m 
iiH \-if.=m+l 

that converges J-'-stably conditioning to A to a vector (d^^\ . . . , d^^'^^'^) de- 
pending only on and W{i^). Besides, for any e > 0, there exists some 
integer uq large enough and some K such that 



P 

where 



< t for any n > Uq, 



In addition, dn'' converges to and n is bounded. 

We prove this theorem after the next proposition, which will be stated
in the following framework: Using the result of Remark 2 we consider the
asymptotic behavior of the vector

$$\left(n^{-1/4} L_n^{(1)}(0),\, n^{-1/2} L_n^{(2)}(0),\, \dots,\, n^{-1/2} L_n^{(2k)}(0)\right)$$

for some $k \ge 1$. We may then consider a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ such
that this sequence is equal in distribution to a sequence converging almost
surely to $(\mu_1 W(\ell_T), \dots, \mu_{2k} \ell_T)$. Consider some point in this
probability space such that $\ell_T > 0$. If the starting point is 0, then the event
$\{\ell_T > 0\}$ is of full measure.
Proposition 2. On the probability space $(\Omega, \mathcal{F}, \mathbb{P})$ above, the random
sequences $d_n^{(k)}$ given by (18) are convergent and bounded in $n$. Besides, for
$m = 1, 2, 3, \dots$,

$$\theta_n = \frac{\alpha_n}{n^{1/4}} + d_n^{(2)} \frac{\alpha_n^2}{n^{2/4}} + d_n^{(3)} \frac{\alpha_n^3}{n^{3/4}} + \dots + d_n^{(m)} \frac{\alpha_n^m}{n^{m/4}} + O\left(\frac{1}{n^{(m+1)/4}}\right)$$

almost surely on the event $\{\ell_T > 0\}$.

Let us start with a simple lemma to get a control over the finite Taylor
expansion of $L_n^{(1)}(\theta)$.

Lemma 1. For any $\theta \in (-1, 1)$ and any integer $m \ge 1$, there exists a random
constant $C$ such that


A:=0 



< sup . < ^^1/2 _ 



\e 



m+l 



1 - \e\)^+^' 

(19) 



Proof. With g and for 9 G (-1, 1), 



(20) 



with ([T]), since L^f (0) /n^/^ converges in probability (either to yUfeffi or to 
depending if k is even or odd), there exists a random constant C 



\L^n\d)\ < 



n 



1/2^ 



'l-\9\f 



Hence 



A:=0 



< sup i4"+2)(e)i-i^r+^ 

m\o\ 



With (10) and (11), this gives (19) because L„ '{0)/n^^'^ is bounded in n. □ 

Lemma 2. For $n$ large enough, the function $L_n^{(1)}(\theta)$ is invertible. Besides,
the function $(L_n^{(1)})^{-1}$ is Lipschitz in $\theta$ with a constant $8/(n^{1/2} \mu_2 \ell_T)$ on the
event $A_n$.



Proof With (20), 



p(A,|x,| + |x,+ii; 

- go(A,X,,X,+i)2 / (1 + 1^1)2 - 4 



1 ^ L^^'^O) 
Since $L_n^{(2)}(0) < 0$ for $n$ large enough, as $n^{-1/2} L_n^{(2)}(0)$ converges in probability
to some negative random variable (see (10)), we get that $L_n^{(1)}(\theta)$ is one-to-one.
With the formula $\partial_\theta (L_n^{(1)})^{-1}(\theta) = 1 / L_n^{(2)}\big((L_n^{(1)})^{-1}(\theta)\big)$, $(L_n^{(1)})^{-1}$ is Lipschitz in
$\theta$ with constant $4/L_n^{(2)}(0)$. □

The idea of the proof is then the following: We construct an estimator
$\Theta_n$ such that for some constant $C$ and $p > 0$,

supn^'/^|L«(e„)| <C. 

neN 

Since L^^^^n) = 0, 

8 8C 

Proof of Proposition For the sake of simplicity, let us set q = v^l^. 

Set 6„ = anq + /3ng^ + Tn^^ + in(f for some /3„, 7^ and ^„ to be carefully 
chosen. Here, we consider only the first terms in the development of It 
is easily to convince oneself that this method may be applied to any order 
and that the involved terms /3„, 7„, . . . may be computed recursively and 
gives rise to (18). 

With (19) and m = 4, there exists a constant C such that 



|5 



- - Ll'\o)el - 4^)(o)e^| < Cn'^'j^^^^,. (22) 

Remark that lI!\o) - Li^^(0)a„g = 0. In order to get rid of the terms in g , 
set 

^-l!?(o) , 

L.?'(0) " 

Since an converges and n V4L(f)(0)/Li')(0) also converges stably, then ?7.^/^/3„ 
converges stably. Also, /3„ converges to 0. 
In order to get rid of the terms in g^, set 



From Corollary [TJ 7„ converges stably since a„ and (in converges stably. 
In order to get rid of the terms in g^, set 



Again, S,n converges thanks to Corollary [Tj 
Hence 

20 

r=5 

where i?„(e„) < n^l'^\Qn? / {l-\Qn\f and B^n^ are terms that depend linearly 
on L[^\q) and on the power of the a„, 7„ and Since the L^n\^) /n^^"^ 
are bounded, we obtain that the n^^'^Bn '' are bounded. 

In addition, n^/^0n is bounded in n, so that n^^^Rn{Qn) is bounded in n. 
With (21), this proves that for some constant K, 

|Bn — 6n\ < 



n 



5/4' 



This result may be generalized to any order. Finally, let us note that /5„ 



alSn^ with (0 = -L^n\0)/L^n\0). With ([T3|, d^n^ converges in probability 



to and n^/'^dn^ is bounded in n. 7„ = a^dn^ and ^„ = a^dn^ where dn^ 
and dn'' are bounded in n. □ 

Proof of Theorem 2. Let us consider the event $\{\ell_T > 0\}$. It corresponds to
the event $A$ as the local time of the Brownian motion is positive just after
having hit 0. Since on this event the local time has a density (see Lemma 3
below) which is derived from the one of the first hitting time of a point $x$, for
each $\epsilon > 0$, one may find a set $\Omega(\epsilon)$ as well as some values $0 < a' < b'$ and $c'$
such that $\omega \in \Omega(\epsilon)$ implies that $\ell_T(\omega) \in (a', b')$ and $|W(\ell_T(\omega))/\ell_T(\omega)| \le c'$ and

$$\mathbb{P}[\Omega(\epsilon) \mid \{\ell_T > 0\}] \ge 1 - \epsilon/2.$$

From the joint convergence of the n^^'^Ln''^^\o) to /U2fc+iW^(^f') ^^"^ ^^e 
joint convergence of the n^^'^Ln''\o) to H2k^T^ S^^ f*^^ ^^^J e > 0, there 
exists < A; < a' and K > b' as well as a measurable set Q'{e, n) C f2(e) such 
that 

L(f)(0) >k^/n on 
and Vn > no, P[r]'(n, e)|{£^ > 0}] > 1 - e. 
In the proof of Proposition 2, we constructed some estimator $\Theta_n$ such
that for some p > 0, nPLn\Qn) is bounded by some constant depending the 
upper bounds of the Besides, we use the Lipschitz constant of 

(Ln '')~^ which depends on the lower bound of 'n}/'^Ln\^). Thus, on fi'(e, n), 
we obtain that \6n — Qn\ < CjnF^^I'^^ where C depends only on K and fc, 
assuming that n > tiq. This means that 

Thus, for any e > 0, there exists no large enough such that 

Wn > no, PK+i/2|0„ - e„,| > C] < e. 
which yields the result. □ 



2.4 The contrast function

In order to study the maximum likelihood, it is also possible to consider the
contrast function

$$Z_n(\theta) = \frac{\exp(L_n(\theta))}{\exp(L_n(0))}.$$

Using the asymptotic development of $L_n(\theta)$ around 0, we get that

$$\log Z_n\left(\frac{\theta}{n^{1/4}}\right) = \frac{\theta}{n^{1/4}} L_n^{(1)}(0) + \frac{\theta^2}{2 n^{1/2}} L_n^{(2)}(0) + O\left(\theta^3 n^{-1/2}\right).$$

Thus, with the result of Proposition 1 and taking into account that $\mu_2 = -\mu_1$,
we see that

$$\log Z_n(\theta / n^{1/4}) \xrightarrow[n \to \infty]{\mathcal{F}\text{-stably}} \mu_1 \left(\theta W(\ell_T) - \frac{\theta^2}{2} \ell_T\right). \tag{23}$$

From this convergence we can intuitively check our result in (5), based on the
theory of convergence of statistical experiments and the LAMN property in
(23). The theory states (under certain stringent conditions that we do not
verify) that the maximum likelihood estimator of the pre-limit experiments
converges stably to the maximum likelihood estimator of the limit experiment
[T]. It is direct, differentiating with respect to $\theta$ in the r.h.s. of (23), to obtain,
when $\ell_T > 0$, that the MLE in the limit experiment is $W(\ell_T)/\ell_T$. We then
obtain (5) in the form
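3 The limit distribution

As $n^{1/4}\theta_n$ and $\alpha_n$ converge to $\Upsilon = W(\ell_T)/\ell_T$, we give the main
characteristics of this random variable. To simplify the computations, we assume
that $x = 0$ and $T = 1$, so that we write $\ell_1 = \ell_1^0$.

This random variable is easy to simulate.

Lemma 3. The distribution of $\Upsilon$ is symmetric. Besides, its density is

$$\frac{dF_\Upsilon(x)}{dx} = \frac{1}{\pi} \int_0^{+\infty} \sqrt{y}\, \exp\left(-\frac{x^2 y + y^2}{2}\right) dy \tag{24}$$

and it is equal in distribution to

$$\Upsilon = \frac{G(H)}{H} \quad \text{with} \quad H = \frac{1}{2}\left(U + \sqrt{V + U^2}\right), \tag{25}$$

where $U$ and $V$ are independent random variables, $U \sim \mathcal{N}(0, 1)$ and $V \sim \exp(1/2)$,
and, conditionally on $H$, $G(H) \sim \mathcal{N}(0, H)$.

Proof. It is well known that the local time $\ell_1$ at time 1 is equal in distribution
to the supremum $\sup_{s \in [0,1]} B_s$ of the Brownian motion on $[0, 1]$. It follows that

$$F_{\ell_1}(y) = \mathbb{P}[\ell_1 \le y] = \mathbb{P}_0\Big[\sup_{s \in [0,1]} B_s \le y\Big] = \mathbb{P}_0[\tau_y > 1],$$

where $\tau_y = \inf\{t > 0 \mid B_t = y\}$. The density $v(t; y)$ of $\tau_y$ is equal to

$$v(t; y) = \frac{y}{\sqrt{2\pi t^3}} \exp\left(-\frac{y^2}{2t}\right),$$

so that

$$F_{\ell_1}(y) = 1 - \int_0^1 \frac{y}{\sqrt{2\pi t^3}} \exp\left(-\frac{y^2}{2t}\right) dt,$$

and the density $f_{\ell_1}(y)$ of $\ell_1$ is then equal to

$$f_{\ell_1}(y) = \sqrt{\frac{2}{\pi}} \exp\left(-\frac{y^2}{2}\right), \quad y \ge 0.$$

Thus, conditioning with respect to the value of $\ell_1$,

$$F_\Upsilon(x) = \mathbb{P}[\Upsilon \le x] = \int_0^{+\infty} \mathbb{P}[W(y) \le xy]\, f_{\ell_1}(y)\,dy$$

and this leads to (24).

Expression (25) follows from the equality in distribution of $\ell_1$ and
$\frac{1}{2}(U + \sqrt{V + U^2})$. This expression has been used in order to simulate the reflected
Brownian motion [15, 16]. □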
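A minimal simulation sketch of the limit variable, based on the representation with $H = \frac12(U + \sqrt{V + U^2})$: it assumes that "exp(1/2)" means an exponential law with rate 1/2 (mean 2) and that $G(H)$ is drawn as a centred Gaussian with variance $H$ given $H$. The function name `sample_limit` is hypothetical.

```python
import math
import random

def sample_limit(rng):
    """Draw (G(H)/H, H) with H = (U + sqrt(V + U^2)) / 2.
    Assumptions: V is exponential with rate 1/2; G(H) ~ N(0, H) given H."""
    u = rng.gauss(0.0, 1.0)
    v = rng.expovariate(0.5)
    h = 0.5 * (u + math.sqrt(v + u * u))
    g = rng.gauss(0.0, math.sqrt(h))
    return g / h, h

rng = random.Random(12345)
draws = [sample_limit(rng) for _ in range(20000)]
values = sorted(d[0] for d in draws)
mean_h = sum(d[1] for d in draws) / len(draws)
print("empirical median:", values[len(values) // 2])
print("mean of H:", mean_h)   # to be compared with E|N(0,1)| = sqrt(2/pi)
```

The empirical median gives a quick symmetry check; the mean of $H$ checks that $H$ behaves like the supremum of a Brownian motion on $[0, 1]$.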
The variance of $\Upsilon$ is 3.16. We see in Figure 1 that the density of $\Upsilon$ is
close to that of the normal distribution, yet narrower.

Figure 1: Density of $\Upsilon$ (solid) and density of the normal distribution with
variance $\mathrm{Var}(\Upsilon)$ (dashed).

4 Numerical tests and observations on the likelihood

Numerical tests are easy to perform, as all the formulae are easy to implement.

4.1 On the coefficient $\alpha_n$

Several tests can be performed on $\alpha_n = -n^{1/4} L_n^{(1)}(0)/L_n^{(2)}(0)$, mainly to see
whether it is reasonable to use it instead of the MLE $\theta_n$.

First, one can check that $\theta_n$ and $\alpha_n/n^{1/4}$ are close, by setting
$\theta_n = \operatorname{argmax}_{\theta \in (-1,1)} L_n(\theta)$ and computing it using a numerical procedure. In
Table 1 one can check that the error $|\theta_n - \alpha_n/n^{1/4}|$ is of order $n^{-1/2}$, so
that $\alpha_n/n^{1/4}$ is a good approximation of $\theta_n$, and is much faster
to compute.

Second, one can check the variance of $\alpha_n$, as well as the adequacy of $\alpha_n$
with the distribution of $\Upsilon = W(\ell_1)/\ell_1$. For this, we have used a set of 10,000
n        mean    n^{1/2} x mean   n^{3/4} x mean   std dev   quant. 90 %
100      0.026   0.26             0.8              0.057     0.082
200      0.028   0.40             1.5              0.083     0.057
500      0.013   0.29             1.3              0.055     0.026
1,000    0.013   0.41             2.3              0.040     0.033
2,000    0.006   0.26             1.8              0.025     0.015
5,000    0.006   0.42             3.5              0.041     0.006
10,000   0.002   0.20             2.0              0.005     0.003

Table 1: Statistics of $|\theta_n - \alpha_n/n^{1/4}|$ over 100 paths.



n       D_n(alpha_n, Upsilon)   p-value       D_n(alpha_n, G)   p-value
100     0.020                   0.30          0.082             < 2 x 10^{-16}
250     0.017                   0.005         0.079             < 2 x 10^{-16}
500     0.024                   1 x 10^{-5}   0.083             < 2 x 10^{-16}
750     0.029                   9 x 10^{-?}   0.087             < 2 x 10^{-16}
1,000   0.012                   0.098         0.072             < 2 x 10^{-16}
2,500   0.023                   4 x 10^{-?}   0.085             < 2 x 10^{-16}
5,000   0.021                   2 x 10^{-?}   0.079             < 2 x 10^{-16}

Table 2: Kolmogorov-Smirnov test on $\alpha_n$ against $\Upsilon \sqrt{\mathrm{Var}(\alpha_n)/\mathrm{Var}(\Upsilon)}$ and
the normal distribution $G$ with variance $\mathrm{Var}(\alpha_n)$ over 10,000 paths.



simulations of $\Upsilon$, and we have renormalized $\Upsilon$ to get the same variance as $\alpha_n$.
Using a Kolmogorov-Smirnov test, we can see in Table 2 that even for a low
value of $n$ (e.g. $n = 1000$), we get a good agreement with the distribution
of $\Upsilon$. Yet, for $n = 100$, the distribution of $\alpha_n/n^{1/4}$ or $\theta_n$ (keeping only
the values in $(-1, 1)$, which means 88% of the values of $\alpha_n$ with $n = 100$ and
96% for $n = 1{,}000$) is in fact close to the Gaussian distribution.

However, the variance of $\alpha_n$ depends on $n$ and is not stable with $n$.

In addition, for small values of $n$, there are some values of $\alpha_n$ such that
$\alpha_n/n^{1/4}$ is outside $[-1, 1]$.

4.2 On the order of convergence

One could wonder if the rate of convergence of $\theta_n$ is really of order $-1/4$.
Numerical simulations show that the rate of convergence, for $n$ in the range
100 to 100,000, is of order $\delta$ with $\delta \approx -0.18$, which is smaller in absolute
value than the theoretical $-0.25$.
This value is found using a regression on the logarithm of the standard deviation
of 500 samples of $\theta_n$ (see Figure 3).

Figure 2: Quantile-Quantile plot of $\theta_n$ (solid) and $\alpha_n/n^{1/4}$ (dashed) against
the normal distribution (left) and the distribution of $\Upsilon$ (right) with variance
$\mathrm{Var}(\theta_n)$ for 10,000 samples. Panels: (a) $\theta_n$ and $\alpha_n$ against normal, $n = 100$;
(b) $\theta_n$ and $\alpha_n$ against $\Upsilon$, $n = 100$; (c) $\theta_n$ and $\alpha_n$ against normal, $n = 1000$;
(d) $\theta_n$ and $\alpha_n$ against $\Upsilon$, $n = 1000$.

Figure 3: Logarithmic regression on the standard deviation of $\theta_n$.



Indeed, one can note that the variance of $\alpha_n$ also depends on $n$, and in the
range from 50 to 600,000, a numerical study of 10,000 samples of $\alpha_n$ shows
that $\mathrm{Var}(\alpha_n)$ seems to be equal to $C n^\beta$ with $\beta \approx 0.08$. This has to be taken
into account in order to design a hypothesis test.
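The empirical rates quoted in this subsection come from regressions in logarithmic scale. A minimal sketch of such a fit is given below; it uses ordinary least squares on $(\log n, \log v_n)$ and synthetic input, since the simulated standard deviations themselves are not reproduced here. The function name `loglog_fit` is hypothetical.

```python
import math

def loglog_fit(ns, values):
    """Least-squares fit of values ~ C * n**slope in log-log scale.
    Returns (slope, C)."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(v) for v in values]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, math.exp(my - slope * mx)

# Synthetic check: data following exactly 3 * n^(-1/4).
ns = [100, 1000, 10000, 100000]
slope, const = loglog_fit(ns, [3.0 * n ** -0.25 for n in ns])
print(slope, const)
```

Applied to the standard deviation of simulated estimators, the fitted slope plays the role of $\delta$; applied to the variance of $\alpha_n$, it plays the role of $\beta$.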



Figure 4: Variance of $\alpha_n$ as a function of $n$ in the linear and logarithmic
scale.
4.3 A hypothesis test

It is then possible to develop a hypothesis test of $\theta = 0$ against $\theta \ne 0$. For
this, let us compute

$$\mathbb{P}\left[|\theta_n| \ge \frac{K}{n^{1/4}}\right] \approx \mathbb{P}\left[|\Upsilon| \ge K\right].$$

Of course, the second type error cannot be computed, as we do not know
the asymptotic behavior of $L_n(\theta)$ when $\theta \ne 0$. However, it is rather easy to
perform simulations and thus to get some numerical information about the
MLE $\theta_n$ and $\alpha_n$. For example, we see in Figure 5 an approximation of the
density of $\alpha_n/n^{1/4}$ for $\theta = 0.5$ compared to an approximation of the density
of $\alpha_n/n^{1/4}$ for the Brownian motion with $n = 1{,}000$. We can note that the
histogram of $\alpha_n/n^{1/4}$ has its peak at 0.5.

Figure 5: Histogram of the density of $\alpha_n/n^{1/4}$ for $n = 1{,}000$ realizations of
the SBm with $\theta = 0.5$ against an approximation of the density of $\alpha_n/n^{1/4}$
for the Brownian motion.
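One possible way to turn the limit theorem into a decision rule is sketched below. It is an illustration under stated assumptions, not the procedure of the paper: the critical value $K$ is taken as an empirical quantile of $|G(H)/H|$ simulated through the representation of Section 3 (with the exponential variable assumed to have rate 1/2), and the null hypothesis $\theta = 0$ is rejected when $n^{1/4}|\hat\theta| > K$. It ignores the observed dependence of the variance of $\alpha_n$ on $n$. All names are hypothetical.

```python
import math
import random

def sample_abs_limit(rng):
    """|G(H)/H| with H = (U + sqrt(V + U^2)) / 2, V exponential of rate 1/2."""
    u = rng.gauss(0.0, 1.0)
    v = rng.expovariate(0.5)
    h = 0.5 * (u + math.sqrt(v + u * u))
    return abs(rng.gauss(0.0, math.sqrt(h)) / h)

def critical_value(level, n_samples=20000, seed=0):
    """Monte Carlo (1 - level)-quantile of the absolute limit variable."""
    rng = random.Random(seed)
    draws = sorted(sample_abs_limit(rng) for _ in range(n_samples))
    idx = min(n_samples - 1, max(0, math.ceil((1.0 - level) * n_samples) - 1))
    return draws[idx]

def reject_null(theta_hat, n, level=0.05):
    """Reject theta = 0 when n^(1/4) * |theta_hat| exceeds the critical value."""
    return abs(theta_hat) * n ** 0.25 > critical_value(level)

print(critical_value(0.05), reject_null(0.02, 10000))
```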



5 An example of application: diffusion of species

As mentioned in the introduction, the SBm is a fundamental tool when one has
to model a permeable barrier. In addition, it appears when one writes down
the processes generated by diffusion equations with discontinuous coefficients
in a one-dimensional medium: this issue is presented in the survey article [13],
with references to the articles where the SBm arose, covering various
fields such as ecology, finance, astrophysics and geophysics.

We present here a possible application of our hypothesis test to ecology;
the test can surely be applied to other fields.

5.1 Does a boundary between two habitats have an effect?

Diffusions are commonly used in ecology to explain the spread of a species,
at the level of individual cells (see for example the book [2]) or at the level of
an animal in a wild environment.

Several authors have proposed the use of biased diffusions to model the
behavior of a species at the boundary between two habitats ([3, 19]), when the
species diffuses at a different speed in each habitat.

Now, consider a situation where the dispersion of a species in two different
habitats is well modelled by a diffusion process, and where the measurements of
the diffusion coefficient give the same value. Does it mean that the boundary
has no effect on the displacement of the individuals?

Let us apply this in a one-dimensional world, where one habitat is $[0, +\infty)$
and the other is $(-\infty, 0]$. We assume that we may track the position of an
individual, whose displacement in each of the habitats is given by $x + \sigma B_t$.

Then, we may apply our hypothesis test to determine whether or not the
position shall be modelled by

(H_0) $X_t = x + \sigma B_t$

or by

(H_1) $X_t = x + \sigma B_t + \theta \ell_t(X)$.

Under Hypothesis (H_0), the boundary has no effect and is not seen. Under
Hypothesis (H_1), the individual is more likely to go into one of the two habitats,
depending on the sign of $\theta$.

5.2 What is the underlying operator?

Now, let us consider that we have a measurement of the diffusion coefficients
that gives two different values, $a_+$ on $\mathbb{R}_+$ and $a_-$ on $\mathbb{R}_-$.

One may then wonder which differential operator shall be used to model
the diffusive behavior. For $a = a_+ 1_{[0,+\infty)} + a_- 1_{(-\infty,0)}$, is it

$$L = \tfrac{1}{2} \nabla(a \nabla \cdot) \quad \text{or} \quad A = \tfrac{1}{2} a \Delta\,?$$
On $(0, +\infty)$ and $(-\infty, 0)$, there is no difference between these 
two operators, which means that the local dynamics of the particle/individual 
is not affected by the choice between $L$ and $A$. However, a difference 
arises at 0: the process $X$ generated by $L$ is the solution to 



$$X_t = x + \int_0^t \sqrt{a(X_s)}\,dB_s + \frac{a_+ - a_-}{a_+ + a_-}\,\ell_t(X),$$ 

while the process $Y$ generated by $A$ is the solution to 

$$Y_t = x + \int_0^t \sqrt{a(Y_s)}\,dB_s$$ 

for a Brownian motion $B$ (see for example [13, 14]). From the analytical 
point of view, the domain $\mathrm{Dom}(A)$ of $A$ contains the functions of 
class $C^2(\mathbb{R})$ which are bounded with bounded first and second order 
derivatives. The domain $\mathrm{Dom}(L)$ of $L$ contains the functions of 
class $C^2(\mathbb{R} \setminus \{0\})$ with bounded first and second order 
derivatives which are furthermore continuous at 0, and such that 
$a_+\nabla f(0+) = a_-\nabla f(0-)$. This condition is called the flux 
condition. In many physical situations, it is assumed that the flux 
$a\nabla u$ is continuous, and this is why divergence-form operators of type 
$L$ arise. 

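Remark 4. Both $L$ and $A$ can be embedded in a single class of operators of 
type $\frac{\rho}{2}\nabla(a\nabla\cdot)$. If $\rho$ and $a$ are constant on 
$(0, +\infty)$ and $(-\infty, 0)$, then we may use the following 
characterization: let us consider 

$$\mathcal{C} = \frac{1}{2}\nabla(a\nabla\cdot) \quad\text{with}\quad a = a_+\mathbf{1}_{[0,+\infty)} + a_-\mathbf{1}_{(-\infty,0)}$$ 

and 

$$\mathrm{Dom}(\mathcal{C}) = \left\{ f \in C^2(\mathbb{R}\setminus\{0\}) \;\middle|\; \begin{array}{l} f, f', f'' \text{ are bounded on } \mathbb{R}\setminus\{0\}, \\ f(0-) = f(0+), \\ (1+\lambda)f'(0+) = (1-\lambda)f'(0-) \end{array} \right\}, \quad \lambda \in (-1,1).$$ 

This class of operators is then specified by three parameters, $a_+ > 0$, 
$a_- > 0$ and $\lambda \in (-1, 1)$. The operator $A$ corresponds to 
$\lambda = 0$, while $L$ corresponds to 
$\lambda = (a_+ - a_-)/(a_+ + a_-)$. 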



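For $\Phi(x) = \int_0^x dy/\sqrt{a(y)}$, $\tilde{X} = \Phi(X)$ is solution to 
the SDE 

$$\tilde{X}_t = \Phi(x) + B_t + \frac{\sqrt{a_+} - \sqrt{a_-}}{\sqrt{a_+} + \sqrt{a_-}}\,\ell_t(\tilde{X}),$$ 

while $\tilde{Y} = \Phi(Y)$ is solution to the SDE 

$$\tilde{Y}_t = \Phi(x) + B_t + \frac{\sqrt{a_-} - \sqrt{a_+}}{\sqrt{a_+} + \sqrt{a_-}}\,\ell_t(\tilde{Y}).$$ 

We then see that both $\tilde{X}$ and $\tilde{Y}$ are Skew Brownian motions, 
but the coefficients in front of their local time have opposite signs. 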

Even if we have not studied the asymptotic behavior of the MLE for the 
SBm with skewness parameter different from 0, numerical experiments back 
the following hypothesis test: 

1. Given an observed $X$, estimate the diffusion coefficient of the process 
on each side of 0. 

2. Apply the function $\Phi$ to the observed process. 

3. Compute the MLE $\theta_n$ of the skewness parameter. If $a_+ > a_-$ 
(resp. $a_+ < a_-$) and $\theta_n > 0$, then decide that the infinitesimal 
generator of $X$ is $L$ (resp. $A$). Otherwise, decide that it is $A$ 
(resp. $L$). 
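
The three steps above can be sketched as follows. This is only an illustration under stated assumptions: the coefficients $a_\pm$ are estimated by the realized quadratic variation of the increments whose two endpoints lie on the same side of 0 (one possible estimator among others), $\Phi$ is taken as the piecewise-linear map with slope $1/\sqrt{a_\pm}$ on each side of 0, and the MLE of step 3 is supplied by the caller as `theta_hat`. All function names are hypothetical.

```python
import numpy as np

def estimate_side_coefficients(path, dt):
    """Step 1 (sketch): quadratic-variation estimates of a_+ and a_-.

    Only increments whose two endpoints lie strictly on the same side of 0
    are used; the path is assumed to visit both sides.
    """
    path = np.asarray(path, dtype=float)
    x, y = path[:-1], path[1:]
    sq = (y - x) ** 2
    pos = (x > 0) & (y > 0)
    neg = (x < 0) & (y < 0)
    a_plus = sq[pos].sum() / (dt * pos.sum())
    a_minus = sq[neg].sum() / (dt * neg.sum())
    return a_plus, a_minus

def phi(path, a_plus, a_minus):
    """Step 2 (sketch): map x to x / sqrt(a_+) if x >= 0, else x / sqrt(a_-)."""
    path = np.asarray(path, dtype=float)
    return np.where(path >= 0, path / np.sqrt(a_plus), path / np.sqrt(a_minus))

def decide_generator(a_plus, a_minus, theta_hat):
    """Step 3: decision rule, given an estimate theta_hat of the skewness."""
    if a_plus == a_minus:
        raise ValueError("the rule needs two different diffusion coefficients")
    if a_plus > a_minus:
        return "L" if theta_hat > 0 else "A"
    return "A" if theta_hat > 0 else "L"

# Tiny deterministic example (dt = 1 chosen only to keep the arithmetic simple).
toy_path = [1.0, 3.0, 1.0, -1.0, -2.0, -1.0]
a_plus, a_minus = estimate_side_coefficients(toy_path, dt=1.0)
transformed = phi(toy_path, a_plus, a_minus)
print(a_plus, a_minus, transformed)
```

On real data, `theta_hat` would be computed from `transformed` with a maximum likelihood routine for the skewness parameter.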

6 Conclusion 

In this article, we have studied the behavior of the maximum likelihood 
estimator for the Skew Brownian motion when the parameter to estimate is 0. 

In particular, we have shown that the rate of convergence of the estimator 
$\theta_n$ is $n^{1/4}$ and not $n^{1/2}$ as in the classical case. This 
should not be surprising: indeed, away from 0, the Skew Brownian motion 
behaves like a Brownian motion, and only its dynamics close to 0 allow one 
to see the difference between a Skew Brownian motion with a parameter 
$\theta \neq 0$ and a Brownian motion. It is also not surprising that the 
local time enters in the limit distribution. 

The case $\theta \neq 0$ remains open. One needs to prove results similar to 
those of J. Jacod [8] when the Brownian motion is replaced by the Skew 
Brownian motion (its distribution with respect to the Wiener measure is 
singular). Of course, one cannot expect the limit law to be symmetric. Yet, 
it is fairly easy to simulate the Skew Brownian motion and to compute the 
maximum likelihood estimator, so that numerical studies are easy to perform. 
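
One simple way to carry out such simulations is the classical random walk approximation of the SBm: a walk with steps $\pm\sqrt{\Delta t}$ that is symmetric away from 0 and moves up from 0 with probability $(1+\theta)/2$. The sketch below is one possible simulation scheme, not necessarily the one used for the numerical experiments of this article; the function name `skew_random_walk` and its parameters are illustrative.

```python
import numpy as np

def skew_random_walk(theta, n_steps, dt, rng):
    """Approximate an SBm started at 0 by a walk with steps +-sqrt(dt).

    Away from 0 the steps are symmetric; at 0 the walk moves up with
    probability (1 + theta) / 2.
    """
    h = np.sqrt(dt)
    u = rng.random(n_steps)                      # uniform draws in [0, 1)
    pos = np.zeros(n_steps + 1, dtype=np.int64)  # position in units of h
    for i in range(n_steps):
        p_up = 0.5 * (1.0 + theta) if pos[i] == 0 else 0.5
        pos[i + 1] = pos[i] + (1 if u[i] < p_up else -1)
    return h * pos

rng = np.random.default_rng(0)
reflected = skew_random_walk(1.0, 1000, 1e-3, rng)   # theta = 1: never negative
skewed = skew_random_walk(0.5, 1000, 1e-3, rng)
print(reflected.min(), skewed[-1])
```

Because the walk lives on a lattice, it should be simulated on a much finer grid than the observation grid and then subsampled before a likelihood based on Gaussian kernels is evaluated on it.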



7 Appendix 

In this Appendix we provide the theorems given in [8] used for the proofs of 
the main results in Section 2. We slightly change the notation and present 
the results in the particular cases that are relevant to us in the present 
work. 

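Denote by $X = \{X_t : 0 \le t \le T\}$ a Brownian motion on a probability 
space $(\Omega, \mathcal{F}, \mathbb{P})$. Introduce a Borel function 
$h\colon \mathbb{R}^2 \to \mathbb{R}$ such that there exist 
$a \in \mathbb{R}$ and $\hat{h}\colon \mathbb{R} \to \mathbb{R}$ such that 

$$|h(x,y)| \le e^{a|y|}\hat{h}(x) \quad\text{and}\quad \int |x|^r\,|\hat{h}(x)|\,dx < \infty. \tag{26}$$ 
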
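Theorem 3 (Theorem 1.1 p. 508 in [8]). Consider $h$ as above, satisfying 
(26) with $r = 0$. Then 

$$\frac{1}{\sqrt{n}} \sum_{i=0}^{n-1} h\big(\sqrt{n}X_{i/n}, \sqrt{n}(X_{(i+1)/n} - X_{i/n})\big) \xrightarrow[n\to\infty]{\mathbb{P}} c(h)\,\ell_1, \tag{27}$$ 

where 

$$c(h) = \iint h(x,y)\,p(1,y)\,dx\,dy, \tag{28}$$ 

and $\ell$ denotes the local time of $X$ at level zero. 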
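
A quick numerical illustration of this normalization can be obtained with the indicator $h(x,y) = \mathbf{1}_{\{|x| < 1\}}$, which satisfies (26) with $r = 0$ and for which $c(h) = 2$, assuming that $p(1,\cdot)$ is the standard Gaussian density. The statistic divided by $c(h)$ is then the usual occupation-time approximation of the local time at 0 with window $1/\sqrt{n}$. The sketch assumes a path sampled on the grid $i/n$, $i = 0, \dots, n$; the names `jacod_statistic` and `indicator` are illustrative.

```python
import numpy as np

def jacod_statistic(path, h):
    """(1/sqrt(n)) * sum_{i<n} h(sqrt(n) X_{i/n}, sqrt(n) (X_{(i+1)/n} - X_{i/n}))
    for a path given on the grid i/n, i = 0, ..., n."""
    path = np.asarray(path, dtype=float)
    n = len(path) - 1
    x = np.sqrt(n) * path[:-1]
    y = np.sqrt(n) * np.diff(path)
    return np.sum(h(x, y)) / np.sqrt(n)

def indicator(x, y):
    # h(x, y) = 1 if |x| < 1 and 0 otherwise, so that c(h) = 2.
    return (np.abs(x) < 1.0).astype(float)

rng = np.random.default_rng(0)
n = 100_000
bm = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(1.0 / n), n))))
# Rough estimate of the local time at 0 accumulated on [0, 1].
print(jacod_statistic(bm, indicator) / 2.0)
```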



Remark 5. It must be noticed that the convergence in (27), as stated in [8], 
is stronger, in the sense that both terms in (27) are processes (i.e. depend 
on $t$) and the convergence is locally uniform in time, in probability. 
Recall that a sequence $(Z^n)_{n \ge 1}$ of processes is said to converge 
locally uniformly in time, in probability, to a limiting process $Z$ if for 
any $t \in \mathbb{R}_+$ the sequence $\sup_{s \le t} |Z^n_s - Z_s|$ goes to 
0 in probability. 

Theorem 4 (Theorem 1.2 p. 511 in [8]). Consider $h$ as above, satisfying 
(26) with some $r > 3$, and assume that $c(h) = 0$ (see (28)). Then 

$$\frac{1}{n^{1/4}} \sum_{i=0}^{n-1} h\big(\sqrt{n}X_{i/n}, \sqrt{n}(X_{(i+1)/n} - X_{i/n})\big) \xrightarrow[n\to\infty]{\text{law}} \sqrt{c(h^2)}\,W_{\ell_1}, \tag{29}$$ 

where $W = \{W_t : t \ge 0\}$ is a Brownian motion independent of $X$, and 
$\ell$ is the local time of $X$ at level zero. The constant $c(h^2)$ is 
given in (28) for the function $h^2$. 

Remark 6. As in the previous remark, the theorem stated in [8] is stronger, 
now in the sense that both terms in (29) are processes, and the processes 
converge stably in distribution in the Skorokhod space of càdlàg functions. 